\section{Funk SVD}
The idea behind Funk-SVD is to use fast gradient decent optimization to be able to do SVD with missing values, in this case missing ratings. First the method squeezes out un-informative information from the matrix. What is left is the most valuable information, which will be used to make predictions more reliably. The Funk-SVD created a new matrix of users, where a user is now characterized by his rating of movie-genres instead of ratings of each individual movie. This matrix is called B. Similarly a movie was before represented by a user-rating relationship, which has been processed into a movie-genre representation. This matrix is called A.

A and B are trained to minimize errors for the observed rating. The product of A  and B will have filled in all the non-rated movie-user pairs, because it has implicitly used the information from similar users and movies to generate ratings which were missing before.